Jamba is a state-of-the-art large language model built on a hybrid SSM-Transformer architecture. It combines the strengths of attention mechanisms with the Mamba state space model, supports a 256K-token context length, and is suitable for inference on a single 80GB GPU.
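The model can be loaded through the Transformers library. Below is a minimal loading and generation sketch; the model ID `ai21labs/Jamba-v0.1`, the bfloat16 dtype, and the example prompt are assumptions for illustration, not an official recipe.

```python
# Minimal sketch: load Jamba via Transformers and generate text.
# Assumptions: model ID "ai21labs/Jamba-v0.1", a GPU with bfloat16 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single 80GB GPU
    device_map="auto",           # place weights on the available GPU(s)
)

# Encode a prompt and generate a short continuation.
inputs = tokenizer("A hybrid SSM-Transformer model is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```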